FoodInsSeg Dataset Documentation

Overview

    FoodInsSeg contains 7,118 food images with instance segmentation masks and categorical labels for 103 food ingredients. It provides 119,048 masks in total: 82,716 for training and 36,332 for testing.
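
    As a quick sanity check on the figures above, the short sketch below confirms that the two splits add up to the stated total and derives the mean number of masks per image. The per-image mean is computed here for illustration and is not a figure stated by the dataset authors; the constant and function names are illustrative.

```python
# Figures quoted in the overview above.
NUM_IMAGES = 7_118
NUM_CATEGORIES = 103
TRAIN_MASKS = 82_716
TEST_MASKS = 36_332
TOTAL_MASKS = 119_048


def masks_per_image(total_masks: int, num_images: int) -> float:
    """Mean number of instance masks per image (derived, not an official statistic)."""
    return total_masks / num_images


# The train and test mask counts should add up to the stated total.
assert TRAIN_MASKS + TEST_MASKS == TOTAL_MASKS

if __name__ == "__main__":
    mean = masks_per_image(TOTAL_MASKS, NUM_IMAGES)
    print(f"mean masks per image: {mean:.2f}")
```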

Data Sources & Construction

    Images are sourced from the train and test sets of the FoodSeg103 dataset.
    Instance masks were generated using InsSAM-Tool together with the FoodSeg103 semantic masks.
    The test set was additionally refined by manually merging, deleting, and re-labeling masks to improve quality.

Accessing the Dataset

    The dataset is available for download at:
    https://laura990501.github.io/FoodInsSeg_dataset/

Dataset Structure

    FoodInsSeg
    -- images
       |-- train
       |   |-- 00000000.jpg
       |   |-- 00000001.jpg
       |   |-- ...  
       |-- test
       |   |-- 00000048.jpg
       |   |-- 00000263.jpg
       |   |-- ...
    -- annotations
       |-- Train.json
       |-- Test.json
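
    The layout above can be turned into file paths with a small helper. This is a minimal sketch that assumes the downloaded archive is unpacked exactly as shown in the tree (including the capitalized Train.json and Test.json names); the function names and the ANNOTATION_FILES mapping are hypothetical and not part of the dataset.

```python
from pathlib import Path

# Annotation file names as shown in the directory tree; adjust if your copy differs.
ANNOTATION_FILES = {"train": "Train.json", "test": "Test.json"}


def annotation_path(root, split):
    """Path to the annotation file for a split ("train" or "test")."""
    if split not in ANNOTATION_FILES:
        raise ValueError(f"unknown split: {split!r}")
    return Path(root) / "annotations" / ANNOTATION_FILES[split]


def image_path(root, split, file_name):
    """Path to one image inside images/<split>/."""
    if split not in ANNOTATION_FILES:
        raise ValueError(f"unknown split: {split!r}")
    return Path(root) / "images" / split / file_name


if __name__ == "__main__":
    # "FoodInsSeg" is the dataset root directory from the tree above.
    print(annotation_path("FoodInsSeg", "train"))
    print(image_path("FoodInsSeg", "test", "00000048.jpg"))
```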

Annotation Format
    The Train.json and Test.json annotation files each contain five fields: "info", "licenses", "annotations", "images", and "categories".
    The "annotations" field stores instance mask information in polygon format, including the mask id, image id, and polygon vertices.
    The "images" field stores the image id, width, height, and image name, among other properties.
    The "categories" field stores the category id and the corresponding category name for each class. For more details on the format, please refer to the official COCO dataset documentation.
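
    The fields described above can be read with the Python standard library alone. The sketch below assumes the files use the standard COCO key names ("image_id", "category_id", "segmentation", "file_name", "id", "name") and that each polygon is stored as a flat list [x1, y1, x2, y2, ...], as in COCO; these keys are not confirmed by this document, so check them against your copy. All function names are illustrative.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Load a COCO-style annotation file into a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def group_annotations_by_image(coco):
    """Map each image id to the list of its instance annotations."""
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[ann["image_id"]].append(ann)
    return dict(grouped)


def category_names(coco):
    """Map category id to category name."""
    return {cat["id"]: cat["name"] for cat in coco["categories"]}


def polygon_vertices(flat):
    """Convert a flat COCO polygon [x1, y1, x2, y2, ...] to (x, y) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError("polygon must contain an even number of coordinates")
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(flat):
    """Area of one polygon via the shoelace formula."""
    pts = polygon_vertices(flat)
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0
```

    A typical use would be coco = load_coco("FoodInsSeg/annotations/Train.json"), followed by group_annotations_by_image(coco) to iterate over the masks of each image. In COCO, "segmentation" is a list of polygons, so one instance may consist of several flat coordinate lists.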

License
    The dataset is made available under the CC BY-NC-SA 4.0 license:
    https://creativecommons.org/licenses/by-nc-sa/4.0/

Author Statement
    The authors bear all responsibility in case of violation of rights and confirm the dataset is released under the specified CC BY-NC-SA 4.0 license.



